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Abstract 

We consider the rewiring of a bipartite graph using a mixture of random and 
preferential attachment. The full mean field equations for the degree distribution 
and its generating function are given. The exact solution of these equations for all 
finite parameter values at any time is found in terms of standard functions. It is 
demonstrated that these solutions are an excellent fit to numerical simulations of 
the model. We discuss the relationship between our model and several others in 
the literature including examples of Urn, Backgammon, and Balls-in-Boxes models, 
the Watts and Strogatz rewiring problem and some models of zero range processes. 
Our model is also equivalent to those used in various applications including cultural 
transmission, family name and gene frequencies, glasses, and wealth distributions. 
Finally some Voter models and an example of a Minority game also show features 
described by our model. 

1 Introduction 

One of the the most important classes of complex network models are those with a 
constant number of edges which evolve by rewiring those edges. The classic example of 
Watts and Strogatz [lj is of this type and such models are often studied in their own 
right [21 [3j HI O El [7]. Network rewiring is also related to to some multi-Urn models 
[H [91 [101 [11] which include what are termed Backgammon or Balls-in-Boxes models [12] 
used for glasses [EUtll], simplicial gravity [15] and wealth distributions [16]. Models of the 
zero range process [T71 [HI [19] are also closely linked. Since most practical systems cannot 
grow indefinitely networks of constant size have many applications: the transmission of 
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cultural artifacts such as pottery designs, dog breed and baby name popularity [2Ql EH 
1221 [23j EH EE] , the distribution of family names in constant populations [26] , the diversity 
of genes [27] EE]- Aspects of the Voter model [29], [30] , as used to describe the competition 
between languages [51] . and the popularity of minority game strategies [32] may also be 
cast in terms of network rewiring. 

Analytic results for network models are limited. A typical approach starts from the 
master equations for the evolution of the degree distribution, these are given in a mean 
field approximation in which the quantities are the average values of many possible re- 
alisations. Luckily in most models, and even in some real-world applications, the results 
from mean field equations often agree extremely well with numerical simulations of the 
model. 

Despite the simplifications brought by the mean field approximation, the equations 
remain difficult to solve and it is normal to study the large graph and long time limit 
e.g. see [33] IM] . For instance the finite size/time corrections to growing graphs using 
linear degree attachment probabilities are complicated and known only as an asymptotic 
expansion, for example see [35] [36] . In fact one of the most tractable examples remains 
the Erdos-Reyni random graph which can be seen as the long time limit of the Watts 
and Strogatz rewiring model [T]. 

What we show in this paper is that the mean field equations for the degree distribution 
of non-growing rewiring models with linear rewiring probabilities can be solved exactly 
for any time. This goes beyond the results found in the literature (typically exact only 
for infinitely large systems in equilibrium) and extends the initial work on exact results 
in such models at equilibrium [37] and the preliminary non-equilibrium results of [35] . 

We start by setting up the model and the mean field master equations for the degree 
distribution in the next Section. We solve these in terms of the generating function 
in Section [3] and from this we consider the degree distribution in Section H] and then 
its moments in Section [5] All this is done in terms of our simple network rewiring 
model but in Section [6] we consider the relation between this model and a variety of 
other abstract models (with and without explicit networks) and to various real world 
examples. Finally we summarise our conclusions and add some observations on how such 
preferential attachment may arise naturally and the scaling properties of our model. 

2 The Model 

We will focus on a generic rewiring problem, which we shall describe in terms of a bipartite 
graph of E 'individual' vertices, each having one edge fixed to any one of N 'artifact' 
vertices, as shown in Fig. [T] Our naming of the vertices reflects our previous work and 
one possible application (cultural transmission) but apart from the names we will keep 
our presentation abstract until Section [6] 

Each individual vertex is always connected to exactly one edge while the other end 
of each edge is connected to any artifact. The network changes by rewiring the artifact 
end of these edges and we will focus on the degree distribution of the artifact vertices at 
any one time, n(k,t) and its probability distribution p(k,t) = n(k,t)/N where k is the 
degree of an artifact vertex. 
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Figure 1: The bipartite graph has E 'individual' vertices, each with one edge. The other 
end of the edge is connected to one of N 'artifact' vertices. If the degree of an artifact 
vertex is k then this artifact has been 'chosen' by k distinct individuals. At each time 
step a single rewiring of the artifact end of one edge occurs. An individual is chosen 
(number 3 here) with probability Ur which gives us the departure artifact (here B). At 
the same time the arrival artifact is chosen with probability ILa (here labelled A). After 
both choices have been made the rewiring is performed (here individual 3 switches its 
edge from artifact B to A). 

To make progress we make further simplifying assumptions. First we will assume 
that the population of individuals is absolutely constant so E is fixed and finite. Almost 
all other comparable work uses a large E approximation. We will also assume that the 
artifact choices available are fixed to be N so the average degree of an artifact vertex is 
(k) = E/N. An important limit is where we take N to infinity so (k) — > 0. 



We will then assume that at each time step one edge is rewireco Continuous time 
evolution is considered in Section 16.11 At each time stepn we first make two choices and 
only then do we change the network. 

First an individual is chosen in some stochastic manner. This individual is attached 
by one edge to an artifact, the departure artifact. It is the artifact end of this edge which 
is to be changed. Thus we are effectively removing an edge from the departure artifact 
chosen with probability^ TI^. The edge chosen is going to be rewired and attached to 
another artifact vertex, the arrival artifact, picked with probability 11^. Thus the master 

1 Alternatively the order in which each individual changes their choice could be made in a more 
systematic way, either in a fixed order or a random order changed once each individual has been rewired 
once. They could even be changes at the same time, the model used in [2TJ H31 121] ■ We have tried 
these variations numerically and they appear to make little difference to the equilibrium results. 

2 The physical time T of each event ( £ Z should be monotonically increasing T(i + 1) > T(t) but 
otherwise it can be arbitrary. This relationship should be derived from the actual frequency of changes 
in the problem of interest. 

3 Alternatively we also have models where the process chooses an edge or artifact directly, with the 
probability Hr. The distinction is immaterial for the degree distribution of the artifacts so we shall take 
these alternatives for granted. 
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equation for the degree distribution in the mean field approximation is 



n(k,t + 1) - n(k,t) 
= n(k + l,t)U R (k + l,t)(l-U A {k + l,t)) 

-n{k, t)U R {k, t) (1 - U A {k, t)) - n{k, t)U A {k, t) (1 - U R {k, *)) 

+n(k - 1, t)U A (k - 1, t) (1 - Il R (k -l,t)), (E > k > 0) . (1) 

For notational simplicity we choose to set n(k) = H R (k) = H A (k) = for the unphysical 
values k — — 1 and k — (E + 1). In this way the equation gives the correct behaviour at 
the physical boundary values of k = and k = E. 

Note that there is a chance (n^n^) that we will choose the same artifact vertex for 
both attachment and removal. As this produces no change in the network we must ensure 
that such events do not contribute to changes in the degree distribution. This is the role 
of the factors of (1 — 11). Such terms are not normally found in the master equations for 
network rewiring [31 El El [7J [H [9j [11] . It is crucial that we do this otherwise we will not 
have the correct behaviour at the boundaries k = and k = E. 

Such (1 — 11) corrections will often be negligible especially for large E systems where 
the probabilities Il(fc) for any individual value of degree k may be tiny^. However there 
are important configurations in this model and in related models where even for large 
systems in equilibrium H(k) are not small for some values of k. This will be discussed in 
Section [6J] after we have obtained the explicit solution. 

The master equation (fT]) is a mean-field approximation for the evolution through our 
stochastic dynamics of the average value of the function n(k). The errors come if 

(n(k,t)f(k,t)) - (n(k,t))(f(k,t)) ^0 (2) 

where / are various combinations of 11^ and 11^. For instance if we have a model with 
attachment or removal probabilities of the form {k 13 / zp) then the problem lies with the 
normalisation as in general 

In many practical cases the fluctuations are small and the corrections to the mean- 
field results are often found to be small. For this reason the equations can be a good 
approximation even if the number of vertices or edges fluctuate provided their average 
values are constant and the variations are small. 

However, there are two special cases where equality holds in ([3]) implying that the 
mean field approximation is exact, namely when f3 = or (3 = 1. Only in these cases are 
the normalisations of probabilities constants of the motion, iV and E respectively. The 
most general choice for and satisfying these criteria is therefore 

n *=£> V-A=Pr^+P P ^ p p + Pr = l (E>k>0). (4) 

This form for means that an edge can be reattached in two ways. With probability p p , 
preferential attachment is used and the artifacts are chosen with a likelihood proportional 



4 These are the "safe situations" of large graphs discussed in Chapter 4 of [3]. 
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to their degree. Alternatively with probability p r a random^ artifact is chosen. Choosing 
a random edge for rewiring corresponds to the use of 'preferential removal' alone. 

There are other good reasons for choosing these forms for the probability apart from 
the fact the master equation is then exact. Mathematically, these simple forms enable 
us to find a complete non-equilibrium solution. In terms of practical applications, one 
may understand these special forms as emerging naturally from a random walk process 
[361 H3] . We will also note the scaling properties of their solutions in Section [7J We 
will limit our analysis to the case (J3j) which means we will present exact results for the 
ensemble average of various quantities at any time and for any values of our parameters. 



3 The Generating Function 

A useful way to investigate the degree distribution n(k, t) is to encode it with a generating 
function G(z, t) 

E 

G(z,t):=J2Mk,t). (5) 

fc=0 

Below we will exploit the fact that G is always a polynomial in z of order no greater than 
E. The mean field equations ([I]) can then be re-written as a differential equation for the 
generating function, 

Kl (1 + _ a ~ C) [G(z,t+1)-G(z,t)} 

= z(l-z)G"(z,t) + [c-(a + b+l)z]G'(z,t)-abG(z,t) , (6) 

where the differentials G" and G' are with double and single derivatives with respect to 
z. The constants a, b and c are given by, 

a=^(k), b=-E, c =l + PL(k)-—. (7) 
Pp Pp Pp 

The equation for n(k, t) is linear — it is completely equivalent to a Markov process 
in an E + 1-dimensional space in which the vector (n(0, t), n(l, t) . . . , n(E, £)) lives [38] . 
Therefore we can define E + 1 eigenvectors uj^ m \k) associated with an eigenvalue A m 
(m — 0,1,2, ... E), which we order such that X m > \ m +i- Furthermore, the properties of 
the Markov process guarantee that 1 > |A m | with at least Ao = 1. 

We can now break the generating function into E + 1 components with the time 
dependence factorised 

E E 

G(z,t) = J2cMm) t G ( ~ m \z) , & m \z) :=Y,z k u {m) (k) (8) 

m=0 fc=0 

where the coefficients c m depend on the initial conditions n(k,t = 0). Again the gener- 
ating functions for the eigenvectors, G^ m \z), are polynomials of degree no larger than 



5 In this paper 'random' without further qualification indicates that a uniform distribution is used to 
draw from the set implicit from the context. 
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E. Substituting this form into (jEJ) gives a a time-independent differential equation for 
G( m )(z), the generating function of the m-th eigenvector: 



z(l - z)G (m) "{z) + [c - (a + b + l)z}G (m) '{z) 

G im \z) = 0. (9) 



ab _ ( Am l2 fe( c _ a - 1) 



This can be solved most easily by writing G as a polynomial in (1 — z). Having the 
correct form for the master equation and therefore the correct behaviour at the boundaries 
ensures that this gives a finite order polynomial. These may be summarised in terms of 
the Hypergeometric function F = 2F1 to be 

G^ m \z) = (1- z) m F{a + m,b + m;c;z) (10) 

r(q + m + QTg + m + Qr(c) , 
r(o + m)r(6 + m)r(c + 0(i!) 

with corresponding eigenvalues, 

A m = l-m(m-l)||-m^, < m < £ . (12) 

An expression for the entries of the eigenvectors LO^ m '{k) may be derived from the coef- 
ficients of z k in fllip which can be given in terms of the Hypergeometric function 3F2, 
though it is not very illuminating and merely assists in explicit evaluations: 

w (m )(Jb) = ( ir r ( fc + !) r ( c + fc ) r( a ) r(6) x 

T(k + 1 — m) T(c + k — m) T(a + m) T(6 + m) 
3 F 2 (-m, a + k,b + k;k + l- m, k + c-m;l- z) \ z=0 uj (0) (k) (13) 

^o) ( M_ r ( Q + fr) r (fe + fc) r(c) 

1 J " r(a) r(6) r( c + fc) ' 1 > 

In the special case of p r = 1, the degree distribution is that of the Watts and Strogatz 
model pQ (see Section [6] below). The generating function then reduces to 

G {m) (z) = (1 _/ 1)g _ m (l - z) m ((l - AT 1 ) + N^z) E -™ . (15) 

More usefully we note for later use that the eigenvalues satisfjlll 

1 = A > A x > ...A m > X m+ i> ■■■Xe = ^>0, (0<p r <l) (16) 
A -1 Pr A — 1 2pr 2pp (M) 



where 



6 We have chosen to normalise the eigenfunctions such that oj^ m '(k = 0) = 1. The physical normali- 
sation needed in the problem is contained in the c rn coefficients of ([8]). 

7 The only case of eigenvalue crossing occurs at p r = when there is degeneracy in the two largest 
eigenvalues with 1 = Ao = Ai. 
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The first consequence of these solutions is that the system evolves to a unique equilibrium 
solution given by [37] 

G(z) := lim G(z,t) = c F(a, b; c; z) . (18) 

t—*oo 

The time scale for the decay of each of the eigenfunctions is given by 

x m = -l/ln(A m ). (19) 



4 The Degree Distribution 

The degree distribution n(k,t) at any time is given as the coefficients of z k in the ex- 
pression for the generating function G(z,t) of (jSJ) and this in turn depends on the initial 
conditions. The equilibrium degree distribution derived from G(z) of (|18p which from (JSJ) 
is based only on the zero-th eigenfunction a/ ' (the m = case of pip ). In particular 
the k — case shows that Co = n(0) = lim^oo n(k = 0, t) and so we have 



1 d k G{z) 



n(k) := lim n(k,t) = — 



w(O) r(a + A;)r(& + A;) T(c) 

z=0 r(* + i) r(a) r(6) r( c + k)- { } 



The total number of artifacts is given simply by the generating function at z = 1 so 

N — CqG{z = 1) = n(0)F(a, b; c; 1) . (21) 

This gives us the equilibrium artifact degree probability distribution function p(k) = 
n{k)/N as 

r (k + Er(k))r ( ^ - ^(k) - k) 

n (k) - A V ^ ') \*> p ^ 1 ) (no) 

m ~ A r(k + i) r(E + i-k) ' [ } 

r r (£ + 1) 

r fe^-W)) r few) r (i)' (23) 

where we have chosen to write the expression in terms of T functions of positive arguments 
and in terms of the original parameters. Two useful values are the degree probability 
distribution for zero degree and maximum degree k = E. The former provides a measure 
of the number of unused artifacts and thus another measure of the uniformity of the 
system and the latter will be discussed in detail below. These satisfy simple formulae 

T(—E) T(^-^(k)) 

v (0) = Kp " ' Kpp PpX " (-24) 

P(U) T(^) T(^E-^(k}) { ' 

T( p ^E)T(E + ^(k)) 



The results for p(0, t) are plotted against exemplary data in Fig. [21 
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Figure 2: Plots of p(0) as a function of re-wirings and the fractional difference between 
the simulation and mean field results for N = E = 100 and p r = 0.1 (crosses), p r = 0.04 
(circles), p r = 0.02 (stars) and p r = 0.01 (squares). Simulations started with n(k = 1) = 
E and zero otherwise. Averaged over 10 6 runs. The solid lines are the results from the 
mean field calculations. 



4.1 Large Degree Equilibrium Behaviour 

The solution for p(k) has two significant parts. The first k dependent ratio of Gamma 
functions in (T221 for k ^> 1 and p r ~ behaves as 



Rl = _^L_1 <x jfe-7 i + 0(k , t^L) , 7 = 1- —(A;) < 1 . (26) 



r(fc + i) V ' p P k J ' p P 

For p r = or (A;) = (which includes when N — > oo) this term gives us an exact 
inverse k power law for all degree k from this term. Another special case corresponds 
to an attachment probability of oc (k + 1) which is often found in the literature, for 
instance [H QUI [EE]. This ratio -Ri is then exactly one for all k so 7 = 0. In general the 
power is usually close but always less than one. 

However the (1 — Ha) and (1 — ITr) terms in ([1]) have led to the second fc-dependent 
ratio of Gamma functions in (T22]) . If E ^> k this gives an exponential cutoff 



r (— — —(k) — k 

\pp PP N ' 



R, = -—-ft) 7 - exp{ ^ }(1 + < § )} ' (27) 

C = -ln(p P ) (28) 
« p r if p r < 1, (k)<£E . (29) 

Within these approximations, this may be expressed in an equivalent manner which is 
some times seen in the literature (e.g. [281 121 



*= YVi V k H) (1+0( > (30) 



While not strictly valid at k « E this second form indicates that there is a change of 
behaviour for large degree if £ < 0. In such a case the numerator of this second k- 
dependent ratio of Gamma functions becomes very large for k = E and we see directly 
that this happens if p r <C pj, where R 2 = 1 (C = 0) at pf 

n-wrhwi s, ^ 1+omE ' 1)h (32) 

At p r = pjj we are closest to a pure power law with a power 7 = y$ = 1 — (2JV) -1 . 
For the special case where N — > 00 so (k) — > we get a perfect inverse power law at 
p r = (l + J B)- 1 . 

As p r drops below this critical value, a spike emerges at k — E from this second 
fc-dependent ratio R2 which comes to dominate the degree distribution at p r — > 0. The 
point where the distribution has become flat at the upper boundary, so n(E) = n(E — 1) 
defines an alternative critical random attachment probability p* at 

P * = E 2 + E(l-(k))-l-(k) ' (33) 
Ep* * 1 + (34) 

Either way when p r <1/E the degree distribution will show a spike at k — E. 

Overall we see two distinct types of distribution. For large random attachment rates, 
Ep r > 1, we get a simple inverse power with an exponential cutoff 

n(k) oc (k)~ 7 exp{-(k} , p r > — . (35) 

This behaviour is often noted in the literature [SI EH EH EU El El [171 EE] and the formulae 
given there for the power 7 and cutoff ( or £ are consistent with the exact formulae given 
here given the various approximations used elsewhere. Note that in any one practical 
example it will be impossible to distinguish the power 7 derived from the data from a 
value of one. This is because to have a reasonable section of power law behaviour we 
require 1 C but this implies that p r is small and so (7 — 1) <C (k). The power drifts 
away from one as we raise the random attachment rate p r towards one but only at the 
expense of the exponential regime starting at a lower and lower degree. Only when the 
power is very close to one can we get enough of a power law to be significant. 

However as p r is lowered towards zero we get a change of behaviour in the exponential 
tail around p r E m 1. First we find the exponential cutoff moves to larger and 
larger values, eventually becoming bigger than E. For p r slightly below p$, that is for 
p r > p* w E , the tail starts to rise. For p r E 1, i.e. if there has been no random 
artifact chosen after most edges have been rewired once, then we will almost certainly 
find one artifact linked to most of the individuals, n(E) w 1. 

This behaviour for Ep r <C 1 has been noted in some of the literature where it is known 
as condensation [TJl EU ESI ESI E3 EB] or, in the older population genetics literature, it is 
called fixation. It mirrors similar behaviour known for growing networks when non-linear 
attachment probabilities or vertex fitness are used, for example see 



9 



4.2 Limiting Values of p r in Equilibrium 



The p r = limit offers some simplifications as well as being the only value in the con- 
densation phase. The condensation is clear from the expression for p(k) of (|22|) as the 
denominator is infinite if (p r /p p )E(l — iV _1 ) = 0, i.e. either we have the trivial example 
of one artifact N = 1 or we are in the pure preferential attachment limit p p = l,p r = 0. 
For the latter p(k) is therefore zero for all values of k except k = or k = E where the 
infinity cancelled by the same term in the numerator. For instance we find for small 
p r the equilibrium distribution has the form 



so only at p r = do we get condensation for any E. This represents a true phase transition 
in the large system (E — > oo, thermodynamic) limit between the gamma distribution ([35]) 
for p r > and the condensation p{k) = <5 fcjjE at p r = 0. 

At the other extreme, we have the limit of pure random artifact selection p r = 1. In 
this limit the model captures exactly the degree distribution of the original Watts and 
Strogatz model p. In this case for any E and (k) the solution for p{k) (1221) reduces to 
a binomial distribution with a probability (1/iV) of any one edge connecting to a given 
artifact vertex, i.e. we have the expected Erdos-Reyni random graph in the long time 
limit. 

4.3 Time dependence of the Degree Distribution 

So far we have looked at the equilibrium behaviour but we have a complete solution 
for the degree distribution for all times and any value of the parameters through our 
eigenf unctions (TL3]) and eigenvalues ({12]) . Alternatively for small values of E it may 
be more convenient to cast this as a matrix problem [35]. We have used the latter to 
predict the degree distribution for any time for a range of p r values either side of and 
approximately equal to the critical value in Fig.s[3], 0] and [51 These (and other figures 
below) show that the degree distribution evolves on time scales r 2 set by eigenvalue 
number two whatever p r we use (why it is not T\ is explained below). Again the exact 
mean field results fit the averaged values from a simulation extremely well. 

8 In this p r — limit we have degeneracy between eigenfunctions number zero and one. However we 
will see below that eigenfunction one a/ 1 -* does not contribute to any physical solution so there is no 
ambiguity about our solution in this case. 




(36) 



(37) 



(38) 
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Figure 3: Plots of p(k) and the fractional deviation between the simulation (data points) 
and exact mean field results (lines) for E = N = 100 and p r = 0.1 after evolving for 
t ps r 2 (crosses), t ps 2r 2 (circles), t ~ 3r 2 (stars) and to equilibrium (squares). The 
solid lines are the relevant mean field results plotted for the same times. Started with 
n(k = 1) = E and zero otherwise and simulation results averaged over 10 5 runs. 



5 The Moments of the Degree Distribution 



The properties of the hypergeometric function mean it is easy to calculate derivatives of 
the generating function at z = 1 at any time as we have 



q (m) 



with = if m > n. This then suggests that rather than work in terms of the higher 
moments (k n ), we use the probabilities F n where^l 



d n G {m Hz) 



(39) 



2 = 1 



(0) 



r(n + l) T(a + n) T{b + n) 



y ' yo T(n + l-m)T(a + m)T(b + m) 
T{c — n — m — a — b) T(c — a) T(c — b) 



x 



T(c — a — b) T(c — a — m) T(c — 6 — to) 



(to < n) (40) 



F n (t) 



T(E + l-n) d n G(z,t) 
T{E + 1) 



dz r 



E 

fc=0 



k (k — 1) (Jfe-n + l) 



n(fc,f) • (41) 



The function F n is the probability that if we choose n distinct edges, they will all share the 
same artifact. The r-th moment (k r ) can be calculated if given all the F n for n < r. The 
F n achieve their largest value only when we have a condensation, p(k) = (N — l)Sk,o + Sk,E 
where F n = 1 for all n > 2. That is we have a perfectly homogeneous population (all the 
individuals are connected to the same artifact). The lowest possible value of F n depends 
on the other parameters. If we have E < N then when all artifacts have at most one 
edge attached then F n = for all n, as we can see in Fig. at the initial time. The 
evolution causes a drift towards a more heterogeneous distribution. The same Figure 



9 This means that the generating function may be written as G(z) 
useful when solving the differential equation ©. 



Ef=o(* 



)F n , a form 
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Figure 4: Plots of p(k) and the fractional deviation between the simulation (data points) 
and exact mean field results (lines) for E = N = 100 and p r = 0.01 after evolving for 
t ps r 2 (crosses), t ~ 2r 2 (circles), t ~ 3r 2 (stars) and to equilibrium (squares). The 
solid lines are the relevant mean field results plotted for the same times. Started with 
n(k = 1) = E and zero otherwise and simulation results averaged over 10 5 runs. 



also shows how the exact mean field results match results from simulations extremely 
well. Mathematically it is clear from the result fill I) that the F n only has contributions 
from the first n + 1 eigenfunctions, i.e. from for m < n. 

In equilibrium only eigenfunction zero contributes and we have a simple result 



lim Fj 



.(*) 



AT V Pp W ' K Pp ' 

T(^(k))T(^E + n) 



(42) 



5.1 Normalisation 

The zero-th moment sets the overall normalisation of the degree distribution n(k, t). This 
is nothing but the total number of artifact nodes iV and for any time t we find it is equal 
to 



N 



t (m) 
mj i/0 



c F(a,6;c; 1) . 



(43) 



m=0 



This result is time independent because it comes only from the zero-th eigenvector, the 
only time independent part of the solution. Thus our solution is consistent with a key 
property in this model, namely the constant number of artifacts N. This then fixes the 
amplitude of the zero-th eigenfunction in (JSJ) to be 



Co 



N 
9o 



(44) 
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Figure 5: Plots of p(k) and the fractional deviation between the simulation (data points) 
and exact mean field results (lines) for E = N = 100 and p r = 0.001 after evolving for 
t ps r 2 (crosses), t ps 2r 2 (circles), t ~ 3r 2 (stars) and to equilibrium (squares). The 
solid lines are the relevant mean field results plotted for the same times. Started with 
n(k = 1) = E and zero otherwise and simulation results averaged over 10 5 runs. 
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Figure 6: Plots of various F n (t) (points) with their exact mean field predictions (lines). 
From top to bottom we have: i^(^) (crosses), F^(t) (circles), F^t) (stars). For E = N = 
100, p r = 0.01 and data points are the average of 10 5 runs of a simulation. 

5.2 Average Degree 

The first derivative of the generating function gives the number of edges 

d 



E 



dz 

E 



G(z,t) 



2 = 1 



E/\ \t (m) c 



m=0 



c %(°) j_ Cl (\ 



(45) 
(46) 



Only eigenfunctions zero and one contribute but the latter leads to time-dependence. On 
the other hand we also have a fixed number of edges in this model as it is one of our 
input parameters. The only solution is therefore C\ = 0. Thus for any physical problem 
there is no contribution from the eigenfunction number one. 

This equation then appears to over constrain our solution as Co is already known from 
the normalisation (144p and all other quantities are fixed. However, we find that using 
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Figure 7: Plots of the Homogeneity Factor -F 2 (£) and the fractional difference between 
the simulation (data points) and exact mean field results (lines) for N = E = 100 and 
different p r . From bottom to top: p r = 0.1 (crosses), p r = 0.04 (circles), p r = 0.02 
(stars) and p r = 0.01 (squares). Initial configuration is n(k = 1) = E and zero otherwise. 
Simulation data is averaged over 10 4 runs. The results are in good agreement with the 
analytic result equation (|52|) . 



standard properties of hypergeometric functions and the normalisation from (1431) that 
the solution already satisfies f|4"5l) and again everything is consistent. 



5.3 Homogeneity Measures F n and Initial Conditions 

The next derivative of the generating function contains the second moment but it is 
preferable to work with our related function F 2 - the probability that two distinct edges 
chosen at random are connected to the same artifact. Similar measures of the homogeneity 
of the artifact choices have been used before such as F = {{k 2 / E 2 )) but this is easily 
calculated from our F 2 measure. We find that 



k=0 



E(E-l) 



E(E-l) 



cosf + c 2 (A 2 M 



U 2 ) 



(47) 
(48) 



Now there is time dependence but only coming from eigenfunction number two. This 
function is readily evaluated using (T40l . the coefficient c 2 fixed upon specification of the 
initial conditions. This formula fits the results extremely well as Fig. [7] shows. 

One of the advantages of the F n measures is that they provide a systematic and 
practical way of fixing the amplitudes of each eigenfunction, the c m coefficients of ([8]), 
from the initial conditions. From the definition (1411) we can express F n in terms of the n-th 
derivatives of the generating functions associated with the m-th eigenfunction evaluated 
at z = 1, i.e. qi m) of (1391). 



T(E+1- 

r(E + r 



J2c m g^ = F n (t = 0). 



(49) 



m=0 
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However only the first n eigenf unctions contribute so we can use an iterative scheme 
to find the first few coefficients quickly. These are sufficient to provide an excellent 
approximation for the degree distribution for most times. 

For example consider the case of uniform initial conditions, such that E < N and 
each artifact is connected to at most one edge, then we have that F n = for n > 2. This 
corresponds to the choice of initial conditions used in obtaining the numerical results in 
Figures [2] though [5j So for these initial conditions the following condition holds, 



Cm9i m) = 0, n> 2. (50) 



£ 

We have already seen that the N parameter fixes Co in (1441) while the first moment or 
equivalently E gives C\ = 0. So starting with n = 2 we have 

(o) 

c 2 = -co^y- (51) 

92 

The exact time dependence of the second Homogeneity function is 

F 2 (t) = (1 - A|)F 2 (oo) 

= (1"A^4^ (52) 
p p + p r E 

Comparisons to numerical results are plotted in Fig. [7J 

Another particularly convenient choice of initial conditions is to attach each individual 
vertex to the same artifact vertex so that n(k = E) = 1, n(k = 0) = iV — 1 and zero 
otherwise. Now F n (0) = 1 and the condition (|4"§|) becomes 

2^C m9n - r , E + 1 _ n y W 

m=0 v ' 

Note, we have put no restriction on the total number of individual vertices, E. For the 
simplest case n = 2 we are led to another simple formula, 

= - A ^ (VtB - + l (54) 

While the exact degree distribution requires knowledge of all the eigenfunctions, the 
low n eigenfunctions still provide a suitable approximation for most time. Fig.s [8] and 
[9] illustrate this, showing the contributions to the degree distribution from successive 
eigenvectors for the two initial conditions discussed above. 
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Figure 8: Plots of the evolution of p(0), p(50) and p(100) for £ = JV = 100, p r = 0.01 
compared with the relevant mean field results. The solid lines represent the exact mean 
field solution, while the numbered dashed lines indicate the successive improvements 
obtained using contributions from A2 (2) up to A5 (5). Simulations started with n(k = 
1) = E and zero otherwise. Averaged over 10 5 runs. 




Figure 9: Plots of the evolution of p(0), p(50) and p(100) for E = N = 100, p r = 0.01 
compared with the relevant mean field results. The solid lines represent the exact mean 
field solution, while the numbered dashed lines indicate the successive improvements 
obtained using contributions from A2 (2) up to A5 (5). Simulations started with the 
alternate initial condition n(k = E) = 1. Averaged over 10 6 runs. 



16 



6 Discussion of other Models 



The bipartite network of FigJT] represents relationships at the core of many models in 
the literature, some of which are not usually expressed in terms of networks. While the 
models considered elsewhere often have additional elements compared with our simple 
model of Sec. [21 those models often contain special cases where the degree distributions 
for any time will be given by our exact result. The aim of this section is to indicate 
the relationship between our model and those found elsewhere. Only some of these 
connections have been made before and then only in some of the literature. We will start 
by considering some generalisations of our simple model as this will then help us to make 
comparisons with previous work. 



6.1 Generalisations of our bipartite model 

Many related models work in continuous time so that the number of rewiring events 
which have occurred corresponds to our discrete time variable. However it is easy to 
recast our simple model as a continuous time process so that n(k, t + 1) — n{k, t) on the 
left hand side of our master equation flTJ becomes dn(k,t)/dt with the ITr and Ua being 
interpreted as rates. This case is just as easy to solve as we replace the form used before 
for our degree distribution and generating function (jSJ) by 

E 

n{k, t) = Y^ c m u {m \k) exp{-A m t} , A m = 1 - A m . (55) 

m=0 

The eigenfunctions k/ m )(fc) and the associated generating functions G^ m ' are exactly as 
before, the new eigenvalues A m have a simple relationship to our original A m and the form 
of the time dependence is altered. Thus our exact solutions may be applied to discrete 
or continuous time. 

Another obvious generalisation of our model is to alter the form of the attachment 
and removal probabilities and (j3J. Suppose 

n *« - (56) 

n*(*) = 4 +P „<l^M + P I, (57) 

with q p + q a = 1 and p p +p a +p r = 1. We have added an extra process to our model where 
with probability p a (q a ) we can attach (remove) an edge from a random artifact chosen 
uniformly from those which have at least one edge. The number of such active artifacts, 
those where k > 0, is denoted N a (t) and this is time- dependent. However the master 
equation ([1]) will no longer be exact because N a varies from configuration to configuration 
so averages of ratios of k m n(k) and N a will not factorise into the ratio of their averages 
([2j). The time dependence of N a (t) also makes the non-equilibrium solutions of the master 
equation hard to find though the equilibrium solution can be found as before [37]. For 
instance the slope of the power law section in the non-condensed phase is now 

7 = 1 - ( - + - - -) (k) (58) 
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and it can now be greater than one. In terms of the condensed phase, this now occurs 
when p p > q p which means that there is now a range of parameter values which lead to 
this phase for large networks. 

Another obvious generalisation is to have terms in 11^ or IIr proportional to general 
powers of the degree (k^/zp) or powers of general functions (a + bk) 13 . As noted in Section 
[2] this means that the master equation (TjQ) is then only an approximation though in many 
cases it will be a good one. 

An important aspect of our model is that we have events where an edge is rewired 
back to the same artifact so that the configuration does not change. We had to include 
the (1 — 11) factors to account for this correctly. If we wish we can exclude these events 
which corresponds to choosing an attachment probability of the form 



where we are removing an edge from artifact labelled d (the departure artifact) and 
adding it to an artifact labelled a (the arrival artifact). The (1 — II) factors in the master 
equation (0Q) are now always one and can be dropped, giving the master equation the 
form often seen in the literature (e.g. in 0, E| El [7J [HI EJ E]). However the preferential 
attachment term p p of the attachment probability IT^ now has a configuration dependent 
normalisation. The mean-field master equation is now an approximation for all p p > 
and it is hard to solve it for arbitrary times. In many cases the fluctuations will be small 
and the mean field will be a good approximation. Further if the number of edges attached 
to any one artifact is small (tends to zero in the large E limit) then the difference between 
our model and one excluding a = d events will be small [3j. Unfortunately, this will not be 
true in the interesting case where we have a condensation since for some artifact vertices 
k a /E will be significant, finite even in the large E limit. We would then expect differences 
between processes based on (jlj) and (1591) . 

Ultimately we could make the attachment or removal rates depend on the individual 
nature of each vertex, e.g. make the probabilities p r and p p vary with artifact vertices. 
This could mimic 'fitness' where some artifacts are intrinsically more likely to attract 
edges. 

One realistic way that artifact fitness could emerge is through the addition of an 
Artifact graph. That is we could add a second network which connects artifacts to 
artifacts and this could be used in choosing how the bipartite graph is rewired. For 
instance, suppose we have chosen the edge we are going to rewire, so that we know the 
departure artifact. We could choose the arrival artifact by making a random walk on the 
artifact graph starting from the departure artifact J36JH3]. In this way the artifacts with 
high degree in the artifact graph would be preferred (even for a walk of one step) and 
a natural fitness assignment for artifacts has emerged. Alternatively, we could view this 
Artifact graph as a way of encoding some distance metric on the artifact space. That 
is when choosing a random artifact, a p r event, it may be that a small variation in the 
artifact, as defined by some metric, is more likely than a large one. Our simple model 
is equivalent to having a complete graph with tadpoles^) for the Artifact graph (the 

10 A tadpole in a general graph is an edge starting and ending at the same vertex. For example see 
vertex A in FigfTPl 





(59) 



18 



adjacency matrix is one for all entries) which we use for the random choice (p r ) events. 
The variation mentioned above, where reconnection to the same artifact is excluded, 
corresponds to a complete Artifact graph with no tadpoles. 

At the moment, our preferential attachment process, p p , has been put in by hand. 
However this can emerge naturally if we add an Individual graph, one which just connects 
the individual vertices. Suppose we have chosen the individual whose edge is to be 
rewired. We now make a random walk on the Individual graph and arrive at an individual 
vertex which is connected to what is now taken to be the arrival artifact for the rewiring 
process. Even a short walk of this type produces good approximations to preferential 
attachment processes [361 HU, H21 H3] . The preferential attachment events in our simple 
model are equivalent to doing a random walk on an Individual graph which is a complete 
graph with tadpoles. 

6.2 Relationship to models in the literature 

The rewiring of unipartite networks has been studied in its own right [H [2j [U [3j El EJ [7J [101 
ITT] but all of these examples contain, in some sense, our bipartite graph. A projection of 
our bipartite graph gives an unipartite graph, made from just the artifact vertices. One 
way to achieve this example is to pair the individual vertices (say individuals numbered 
(2i — 1) with (2i)) and to consider the two edges of these individual vertices as the two 
ends (the stubs) of a single edge in the new undirected graph. Thus the process our 
simple model illustrated in Fig. [JJ represents a rewiring process in some undirected graph 
as shown in Fig. [TU1 In this way, or by considering the problem directly, we see that the 
mean field equations (pQ) are the same and we need only alter the normalisations in the 
probabilities (0J. 



N Vertices 




(1,2) (3,4) 

E/2 Edges 



Figure 10: How the rewiring of the bipartite graph represents the rewiring of an undirected 
unipartite graph. In this example individual vertices numbers (2i) and (2i — 1) are paired 
in the bipartite graph to give the edge labelled (2z — l,2z) in the equivalent undirected 
graph. The rewiring of the undirected graph depicted in this figure is is equivalent to 
that shown for the bipartite graph rewiring of Fig. [TJ This is the projection used by 
Molloy and Reed [33]. 

Note that the degree distribution of the projected undirected graph at any one time is 
independent of how we pair off individual vertices in the bipartite graph. Thus the degree 
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distributions of many different unipartite networks is represented by the same bipartite 
graph. Indeed this is the same projection used by Molloy and Reed to construct general 
random graphs, that is graphs of a given degree distribution but otherwise arbitrary [39] . 

There are several expressions for global properties of large generalised random graphs 
which depend on the ratio of the second and first moments through a parameter z [391 

HUSSIES! 

z(t):= { j^-l = (E-l)F 2 (t) (60) 

Thus for large general random graphs being rewired using any mixture of random vertex 
and preferential attachment, we can give these global properties at any time. For instance 
the mean inter vertex distance £ scales as (ln(iV)/ln(z(i) + const) so we see from (152]) 
that for large E we only avoid i scaling as ln(iV) if p r E ~ 0(1). 

Similarly a GCC (giant connected component) is present in this unipartite projection 
when z > 1. We see that this will always appear if (k) > 1 or, if (k) < 1, it appears 
only if p r < (2 — (A;)) -1 . Suppose we start from the most disconnected example where 
F 2 (t = 0) = (so (k) < 1). Using ([52} we can find the time at which the GCC 
first appears. If p r E ~ 0(1), which includes the condensate region, we find that the 
GCC appears at t = E/2. This is much quicker than the approach to the equilibrium 
configuration which happens on a time scale r 2 ~ 0(E 2 ). If p r is raised from 0{E^ 1 ) 
towards the critical value for the existence of a GCC, (2 — (A;)) -1 , the time at which the 
GCC appears increases, reaching infinity at the critical value of p r . 

A different example of this projection is when our initial bipartite graph has each 
artifact connected to m individuals {n{k) = N5k m ). The unipartite graph projection 
is then a randomised version of the graphs used by Watts and Strogatz [1J. From this 
initial condition and setting p r = 1 we therefore have the exact solution for the degree 
distribution at any time in the Watts and Strogatz model. The pairwise correlation of 
individual vertices is only required if we want to know about other aspects of the Watts 
and Strogatz networks, such as the network distance and clustering coefficients which were 
the focus of P]. We can though calculate such quantities at any time in the randomised 
graph which provides a useful comparison. 

It is straightforward to adapt this projection so that we get a directed graph. For 
instance the direction of an edge in the projected unipartite graph could flow from the 
artifact connected to individual (2i — 1) to the artifact connected to individual (2i). A 
simple modification of the master equation ([T]) is needed to keep track of the in- and 
out-degree if we choose to make these edges directional. 

One can also think of other types of projection onto unipartite graphs. Suppose 
one fuses each individual vertex i with an artifact vertex which is not necessarily the 
artifact connected to that individual by the individual's edge in the bipartite graph. 
The individual-artifact edges of the bipartite graph now represent edges between the 
fused vertices of this projected unipartite graph. There is a natural direction associated 
to these unipartite edges coming from the individual-artifact direction of the bipartite 
graph, and this can be maintained or ignored as needed. A simple example of this 
projection is where the numbers of individual and artifact vertices are the same and we 
fuse each artifact vertex with one individual vertex. If we let the edges of the bipartite 
graph represent edges from the individual to an artifact in the unipartite graph, then the 
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unipartite graph vertices have out-degree equal to one and this is one way of representing 
the degree distribution of the graphs of j5] and as illustrated in Fig. [TTJ Our master 
equation ([!]) is then the exact mean field description for the in-degree in this case. 




Figure 11: Another type of projection from our bipartite to a unipartite network. Each 
individual vertex is fused with one of the N artifact vertices to produce a unipartite graph 
with iV vertices. The edges of the unipartite graph are naturally directional coming from 
the vertex associated with the old individual vertex of the bipartite graph and going to 
the old artifact vertex. In the simplest case we have the same number of artifacts E as 
individuals N and then we fuse artifact A with individual 1 to give a unipartite vertex 
labelled Al etc. This produces a network of the type used in [5]. The figure shown here 
is this simple projection of the bipartite graph and rewiring event of Fig. [TJ 

We will now turn to problems where there is no explicit reference to a network in the 
standard exposition but which can still be related to our model. In such cases there is an 
implicit graph in the problem which one may define to make contact with our realisation, 
but this network may not be relevant in these other studies. 

The work on cultural transmission [2Ql ED [22], [231 1211 [25] is usually developed without 
reference to any network. The names for our vertices come from this case. In this 
context individuals are deemed to be choosing some artifact of no particular value (pottery 
designs, pedigree dog breeds or baby names for example) by copying the choice of another 
individual — preferential attachment. Sometimes though one can expect innovations to 
be made when a completely new artifact is introduced. This translates to a random 
attachment event in the iV — > oo limit. While this work does not generally use a network 
picture, it does translate directly into our network model (e.g. see [21]). Here the edges 
in our network realisation represents the artifacts chosen by each individual. In cultural 
transmission problems, samples of these distributions are often available, from records of 
births, pedigree dog registrations, or reports from archaeological excavations^. 

It is relatively easy to see how the same model may be used for family names rather 
than the personal names of [22] • In this case the partners who change their family name 
are represented by the individual vertices, the family names are the artifact vertices and 
the edges represent the partners who keep their family name. This is then the constant 
population limit of the models in [26J. 

This family name example shows that this model may be linked to inheritance pro- 
cesses. As noted elsewhere [211 E21 [23], El] the oldest examples come from a simple model 

n Sometimes these samples are medium term time averages of the degree distribution J p(k,t) and 
such distributions may take a different form, a problem addressed in |221 123] . 
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for the diversity of genes in a constant population due to Kimura and Crow [271 128] . In 
the case of a haploid cell (viruses, bacteria and blue-green algae provide examples) the ar- 
tifacts are alleles of a single gene carried by each individual. The preferential attachment 
events correspond to inheritance of genes. This produces a drift towards homogeneity 
and, if unchecked, a condensation or fixation in the frequency of alleles in the population. 
The random attachment process is mutation in this context. 

In general a fitness is assigned to each gene representing the chance of survival and 
the successful birth rate associated with having that gene. There may also be different 
mutation rates associated with each gene. However in the simplest models such factors are 
ignored. Translating the haploid gene model for a constant population into the language 
of our network rewiring model is then simple. The organisms are the individuals and 
we consider one gene carried by each individual. Each different allele of this gene is a 
distinct artifact vertex and so each edge records the allele carried by an individual. The 
rewiring example of Fig. [1] is translated into a haploid model as shown in Fig. [121 



N Alleles 




E Individuals 

Figure 12: Interpretation of the the example shown in Fig. [T]as a haploid gene inheritance 
and mutation model. Each individual carries one copy of a gene and each different version 
of the gene, an allele, is represented by an artifact vertex. The edges indicate the allele 
present in each individual. Note this also serves as a model of family names for a constant 
population if one partner inherits the family name of the other partner. In this case the 
alleles (artifacts) are the family names, the edges are the males and the genes (individuals) 
are the females. 

In a diploid cell there are two copies of a gene and most cells of most higher organisms 
are of this type. Ignoring fitness etc. we can see that we can represent the allele frequencies 
of one gene in a constant population of diploid cells with the usual rewiring model as 
shown in Fig. [131 

There is also a close relationship between our network model and various models of 
statistical physics, a connection already noted in some places [HI [10]. In the original 
Urn model of the Ehrenfests [17] one has E balls placed in two Urns. At random times 
given by a Poisson process, a ball chosen at random is moved from one urn to another. 
This corresponds to a continuous time version of our model with the artifacts being the 
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Figure 13: Interpretation of our model as a diploid gene inheritance and mutation model. 
Organism (3',4') dies and is replaced by (3,4) whose parents are the organisms (1,2) and 
(5,6). From the (1,2) parent it inherits copy number 2 of the gene which is allele A. 
However the gene it inherits from the (5,6) parent mutates to allele N. 

Urns so N = 2 and the individuals represent the balls. Choosing p r = 1 in our model 
reproduces the behaviour of the original Urn model. 

There is one subtlety in that in the original Urn model the ball is never put back into 
the Urn it was drawn from. These events are allowed in our model and are precisely the 
ones which require the factors of (1 — 11) in our master equation (TjQ) as they leave the 
configuration unchanged. The difference between the original Urn model and our model 
for N = 2, p r = 1 and continuous time is just a matter of a factor of two in the rates. 

However generic Urn models are often encountered in some obvious variations of the 
Ehrenfest version, in particular with N urns and with different forms for the rate at which 
balls are moved and where they are then placed [S], El QUI EH] ■ Some of these variations of 
the original Urn model are equivalent to other models such as the Backgammon or Balls- 
In-Box models used for glasses [131 E]> simplicial gravity [15] and wealth distributions 
[T6] . The zero range processes [T7J HH [19] can also be interpreted as Urn models, with 
the 'misanthrope' process on a fully connected geometry being closest to our basic model. 

Using the terminology of the Urn model review [9J, the 'geometry' of the Urn model 
refers to which Urns are connected — an artifact network in our model as discussed in 
Section 16.11 The simplest 'mean-field' geometry, i.e. a complete graph for the artifact 
network, is what we assume in our simple model. On the other hand the basic zero-range 
process models [T71 [T8] use a one- dimensional ring. If we allow processes where the ball 
is placed back into the urn it came from, then the rate at which a ball moves is given by 
u(d, a) per ball where d is the departure urn and a the arrival urn. Usually the rates used 
factorise into two terms, one depending only on the number of balls in the departure urn 
kd (number of edges, i.e. the artifact vertex degree) and the other on the number of balls 
in the arrival urn k a . In our terminology u(d,a) = U.n(kd)llA(k a ) . Then the three rules 
for ball selection discussed in [8j [9J correspond to our generalisation (ISTj) as follows: Rule 
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A (random ball to random urn, Ehrenfest class) is our q p = 1, p r = 1; Rule B (random 
urn to random urn, Monkey class) is our q a = 1, p r = 1; Rule C (random ball to random 
ball) is our q p = 1, p p = 1. 

However we stress that to include processes where balls are returned to the urn they 
were drawn from, the master equation has to contain the factors (1 —11) while such terms 
are normally absent in the evolution equations of literature on Urn and related models 
e.g. in [HI El UU] - If we were to exclude such events then our transition rates u(d, a) will not 
factorise into departure and arrival dependent termg[jf|. For instance in our language the 
factor normally associated with the arrival vertex, our attachment probabilities n^, will 
have to depend on the departure urn as well as on the properties of the arrival urn, and 
for us would take the form (159]) while the literature usually uses simple factorisable forms 
e.g. [SI El QUI EEE]- As we noted in Section I6TT1 this will be important when a significant 
fraction of balls are in any one box, as is the case with a condensate. 

There is a way round this problem and that is to work with our solution in continuous 
time (I55I) and then to rescale our time t back into the time t urn of an Urn model where 
one can not put the ball back into the urn it was drawn from. From the number of these 
events allowed in our model but excluded from an Urn model we have for infinitesimal 
time steps 

~Pr Pp(k 2 }' 



dt - dt, 



dt (61) 



N E(k) _ 

where the second moment (k 2 ) is easily derived from i*2(0 of 

Finally we note that many models in sociophysics may be cast as generalisations of our 
bipartite rewiring model. If we add an Individual graph then for a copying (preferential 
attachment) event, an individual copies the artifact choice made by one of its neighbours 
in the Individual graph. When p p = 1 and N = 2 this is the basic Voter model [29] , 
as used for instance for language evolution [31]. Our results are equivalent to having an 
Individual graph which is complete with tadpoles^!. Our time scale is fa/E) = E/2 
which agrees with the 0(E) result quoted in [30]. However our result shows the effect on 
both the consensus and on the time scale to reach equilibrium of adding some randomness 
to such Voter models. 

Our results may also be useful in other sociophysics models. In one variation of the 
Minority Game [32] individuals choose the 'best' strategy known to them, comparing 
their own against all those used by their neighbours as defined by an Individual graph. 
Each artifact vertex in our model would then represent a different strategy. In this case 
what is best is continually changing as generally the more popular one strategy becomes 
the less successful it will be. Thus statistically, it is likely that the resulting instantaneous 
artifact degree distribution n(k,t) will be indistinguishable from that obtained by just 
copying the artifact of a random neighbour which as a simple random walk is likely to 
lead to effective preferential attachment. It is no surprise then that the long time results 
for the popularity of strategies in [32J follows a simple inverse power law with a large 



12 One exception is for a two Urn model as then the number of balls in the arrival urn may always be 
written in terms of those in the departure Urn and vice versa. This means typical expressions for rates 
are always factorisable. 

13 If this is performed on just a complete graph then this is an TV = 2 Urn model with rule C of [8, 9 
performed on mean field Urn geometry. 
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degree cutoff, the form found in ( |35i) . 



7 Summary and Conclusions 

The starting point of our work is the observation that the usual mean field master equa- 
tions seen for network evolution are not suitable for general rewiring problems. One 
needs to add the factors of (1 — 11) seen in (TTJ) if the degree distribution is to behave 
properly at the maximum degree |37j . With these terms and the simplest case of linear 
attachment/removal probabilities the exact solution for the degree distribution at any 
time can be found for arbitrary values of the parameters, here expressed in terms of the 
generating function G(z, t) of ^ and ([IT]) . This is better than can be done for simple 
growing networks where the exact equilibrium solution is known for simple attachment 
probabilities but the finite size (finite time) system corrections are only known asymp- 
totically (e.g. see [331 EH [36]). Previous results for equivalent models give results that 
are only approximations, often for infinitely large systems in equilibrium though all are 
consistent with the results derived here^l. We have also compared our analytic results 
against numerical simulation in several ways and seen that agreement is excellent. We 
know of no other network model that has the exact time dependent solution for arbitrary 
parameters range and suggest that this model may prove to be as useful a model as the 
Erdos-Reyni random graph has been. 

In particular the equilibrium degree distribution of [37J is found as the long time 
solution. It has two characteristic regimes: if when all edges have been rewired once (on 
average) at least one rewiring was done randomly then a simple inverse power law with 
exponential cutoff is obtained, otherwise we have a regime with a condensate. 

We have confirmed the slow approach to equilibrium and the conjectured form for the 
second eigenvalue Ai of [38J. However here we have shown that the long time evolution is 
governed not by the second largest eigenvalue but the third largest, A 2 = 1 — (2p r /E) — 
(2p p / E 2 ) with associated time scale r 2 = — l/ln(A 2 ). 

We have also noted that this simple bipartite graph rewiring model captures the degree 
distribution of many other networks, with that of the original Watts and Strogatz model 
[lj as one limit of our model. In particular we have the exact degree distribution at any 
time and any parameter value for the rewiring of a general random graph. From this 
we can obtain various global properties analytically as a function of time using various 
known formulae [391 HH S3 HE]. However many of the alternative realisations require 
no explicit network as in the link to Urn/Backgammon/Balls-in-Boxes models and zero 
range processes [10l[EIl[E2l[E3l[E2[Eg[Eg[^ 

The model also has a wide range of practical applications. As most practical systems 
can not grow indefinitely, this fixed sized rewiring model will often be more appropriate 
than a growing network model. The Urn- type models have been applied to glasses [T3] , 
simplicical gravity [15] and wealth distributions [16]. Models for social science in both 
modern and archaeological contexts [201 [221 I2H [23] can be be cast as our model. 
The applications in these papers include baby name frequencies [22], pedigree dog breed 

14 The literature does however tackle other more complicated variations of the model not considered 
here. 
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popularity [23] and pottery styles [201 ET]. Sociophysics models may also be related to our 
work. The Voter model [291 E0] , as applied to language evolution [31] , and the choice of 
strategy in a Minority game variant [32], may be linked to our bipartite graph approach. 
Finally basic models of population genetics [271 12H] and more generally any process where 
inheritance is important, such as with Family names [26], can be encoded by our network 
rewiring. 

Many of the related examples in the literature also study cases beyond our simple 
model, for example non-linear attachment probabilities IT^ oc k 13 for /3 6 R, attachment 
probabilities whose scale varies with the artifact (fitness), pure Artifact or pure Individual 
graphs, and growing systems (dE/dt) ^ 0. These can often be captured by extensions to 
our basic model but the downside of this sophistication is that the mean field equations 
are then only approximations whose exact algebraic solution is probably unobtainable in 
any case. Rather the literature usually works in a large network long time approximation, 
E 3> 1, often in a particular part of parameter space such as 1 ^> p r 3> E^ 1 or p r = 0. 
We though have exploited the simplicity of our model in order to obtain exact solutions 
for any time or parameter value. 

At worst this simple bipartite model provides a useful null model against which to 
test other hypotheses [23]. However we have also argued in Section IBTT1 why copying may 
be a more widespread method than the obvious cases involving inheritance mechanisms. 

Finally we note the scaling properties of the model. In many practical examples the 
artifacts are really categories imposed by investigators on a collection of objects. In 
almost all cases, each object could be individually identified if one wishes. Indeed the 
objects may be being chosen by individuals based on characteristics completely different 
from those recorded by the researcher. No pedigree dog [23] is genetically pure, a personal 
[22] or family name [26] may come in several close variations, and who assigns a particular 
style to an archaeological pottery find [201 121] ? 

Consider an exemplary small study [H] where the shoes of around two hundred male 
physics students leaving a lecture were photographed. Various researchers categorised 
them in completely different ways giving different degree distributions from the same 
data. For instance one could categorise each shoe by colour, material and fastening 
method. Still what constitutes say a 'blue' shoe may be a context dependent matter 
of physical and social perception so researchers and wearers may not even agree how to 
classify a given shoe under the one scheme. 

So if such artifact popularity distributions are to have much meaning they ought to be 
largely independent of this categorisation. Thus consider pairing the artifacts at random 
and calculating the degree distribution for these pairs, n2(k), even though the model 
continues to make its rewiring selections based on the original single artifact vertices. 
That is at each event we choose to make a preferential (copying, inheritance) attachment 
or a random (innovation, mutation) attachment to the artifact pairs with exactly the same 
probability p v and p r . Given our linear attachment/removal probabilities the effective 
probability for attaching to a given artifact pair is just the sum of the degrees of its 
constituent artifacts, i.e. it is still proportional to the degree of the artifact pair. On the 
other hand the probability we attach to a given artifact chosen at random is halved but 
only because the number of artifact pairs N 2 is just half the original number of artifacts, 
N — > N2 = N/2. With the number of edges E unchanged, changing N to N2 is the 
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only change we need to make in our equations. Vitally, the form for the attachment and 
removal probabilities remains the same and thus the form of the solutions is unchanged. 
We will get the same qualitative behaviour, a power law with an exponential cutoff. In 
such cases the cutoff ( (128]) remains unchanged and only the slope 7 (1261) changes in 
comparing n(k) to the pair degree distribution n2{k). However the slope is invariably 
indistinguishable from one in a practical data set or it will be unmeasurable with a small 
cutoff (. Thus for a linear attachment plus random attachment model, the distribution 
of artifact choice is independent of how artifacts are classified for all practical purposes. 
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